gh-128013: fix data race in PyUnicode_AsUTF8AndSize on free-threading#128021
Conversation
87bdaea to
0f692ce
Compare
colesbury
left a comment
There was a problem hiding this comment.
I think there's still a bug where PyUnicode_UTF8() is checked outside the lock, but the condition may change once the lock is acquired (because some other thread filled in the utf8 field).
I think we should refactor out the check into something like unicode_ensure_utf8 that does the double-checked locking:
static int
unicode_ensure_utf8(PyObject *unicode)
{
int err = 0;
if (PyUnicode_UTF8(unicode) == NULL) {
Py_BEGIN_CRITICAL_SECTION(unicode);
if (PyUnicode_UTF8(unicode) == NULL) {
err = unicode_fill_utf8(unicode);
}
Py_END_CRITICAL_SECTION();
}
return err;
}unicode_fill_utf8 should assert that the critical section is held.
|
I wrote PR gh-128061 "Convert unicodeobject.c macros to functions" to prepare the code for this change. |
|
I have updated the PR to use the new static inline functions and it now uses acquire/release semantics for utf8 member. I have tested the reproducer from issue and now there aren't any data races AFAICS. |
colesbury
left a comment
There was a problem hiding this comment.
Overall looks good, I think there's just one issue in _PyUnicode_CheckConsistency.
|
The failure looks unrelated: |
|
GH-128417 is a backport of this pull request to the 3.13 branch. |
Uh oh!
There was an error while loading. Please reload this page.